


DEC 02 2002






The NF-kappa B sequence in step 63 of the original application was obtained from HIV-1
(nt 349-374, nt 9434-9458), as shown on pages 8 and 11 of the sequence listing.
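
As a hedged illustration of how the coordinate ranges above map onto the sequence listing, the following minimal Python sketch reads two ORIGIN-style lines (copied from positions 301 to 420 of the K03455 record reproduced in this document) and extracts nt 349-374 using 1-based, inclusive coordinates. The helper names `parse_origin_lines` and `subsequence` are hypothetical and are not part of the application; the sketch also assumes the supplied lines are contiguous.

```python
# Illustrative sketch only; helper names are hypothetical, not from the application.

def parse_origin_lines(lines):
    """Return (first_coordinate, sequence) from contiguous GenBank ORIGIN-style lines.

    Each line starts with the 1-based coordinate of its first base,
    followed by space-separated blocks of bases.
    """
    first = None
    parts = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if first is None:
            first = int(tokens[0])
        parts.append("".join(tokens[1:]))
    return first, "".join(parts)


def subsequence(first, seq, start, end):
    """Extract bases start..end (1-based, inclusive) from a parsed fragment."""
    if start < first or end >= first + len(seq) or start > end:
        raise ValueError("requested range lies outside the parsed fragment")
    return seq[start - first:end - first + 1]


# Two lines copied from the ORIGIN section of the K03455 record (positions 301-420).
ORIGIN_LINES = [
    "301 agctgcatcc ggagtacttc aagaactgct gacatcgagc ttgctacaag ggactttccg",
    "361 ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat",
]

first, seq = parse_origin_lines(ORIGIN_LINES)
probe_region = subsequence(first, seq, 349, 374)
print(probe_region, len(probe_region))
```

Because both endpoints are included, the range 349-374 spans 26 bases; the same call with 9434 and 9458 on the corresponding lines of the 3' LTR would span 25 bases.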



K03455. Human immunodefic... [gi:1906382]



LOCUS HIVHXB2CG 9719 bp ss-RNA VRL 19-AUG-1999

DEFINITION Human immunodeficiency virus type 1 (HXB2), complete genome; 

HIV1/HTLV-III/LAV reference genome.
ACCESSION K03455 M38432 
VERSION K03455.1 GI: 1906382 

KEYWORDS TAR protein; acquired immune deficiency syndrome; complete genome; 

env protein; gag protein; long terminal repeat (LTR); pol protein;
polyprotein; proviral gene; reverse transcriptase; transactivator. 
SOURCE Human immunodeficiency virus type 1. 

ORGANISM Human immunodeficiency virus type 1 

Viruses; Retroid viruses; Retroviridae; Lentivirus; Primate
lentivirus group.
REFERENCE 1 (bases 493 to 674; 9577 to 9718) 

AUTHORS Ratner,L., Haseltine,W., Patarca,R., Livak,K.J., Starcich,B.,
Josephs,S.F., Doran,E.R., Rafalski,J.A., Whitehorn,E.A.,
Baumeister,K., Ivanoff,L., Petteway,S.R. Jr., Pearson,M.L.,
Lautenberger,J.A., Papas,T.S., Ghrayeb,J., Chang,N.T., Gallo,R.C.
and Wong-Staal,F.
TITLE Complete nucleotide sequence of the AIDS virus, HTLV-III 

JOURNAL Nature 313 (6000), 277-284 (1985) 
MEDLINE 85111123 
PUBMED 2578615 
REFERENCE 2 (bases 1 to 653) 

AUTHORS Starcich,B., Ratner,L., Josephs,S.F., Okamoto,T., Gallo,R.C. and
Wong-Staal,F.

TITLE Characterization of long terminal repeat sequences of HTLV-III 

JOURNAL Science 227 (4686), 538-540 (1985) 

MEDLINE 85090465 
REFERENCE 3 (sites)

AUTHORS Allan,J.S., Coligan,J.E., Barin,F., McLane,M.F., Sodroski,J.G.,
Rosen,C.A., Haseltine,W.A., Lee,T.H. and Essex,M.

TITLE Major glycoprotein antigens that induce antibodies in AIDS patients 

are encoded by HTLV-III 

JOURNAL Science 228 (4703), 1091-1094 (1985) 

MEDLINE 85192537 
REFERENCE 4 (sites)

AUTHORS Sodroski,J., Patarca,R., Rosen,C., Wong-Staal,F. and Haseltine,W.

TITLE Location of the trans-activating region on the genome of human 

T-cell lymphotropic virus type III 

JOURNAL Science 229 (4708), 74-77 (1985) 

MEDLINE 85244627 
REFERENCE 5 (sites) 

AUTHORS Arya,S.K., Guo,C., Josephs,S.F. and Wong-Staal,F.

TITLE Trans-activator gene of human T-lymphotropic virus type III

(HTLV-III)

JOURNAL Science 229 (4708), 69-73 (1985)
MEDLINE 85244626

REFERENCE 6 (sites)
AUTHORS van Beveren,C.P., Coffin,J. and Hughes,S.
TITLE Appendix B; HTLV-3/LAV genome
JOURNAL (in) Weiss,R.L., Teich,N., Varmus,H. and Coffin,J. (Eds.);
RNA TUMOR VIRUSES, SECOND EDITION, 2, Vol. 2: 1102-1123;
Cold Spring Harbor Laboratory, Cold Spring Harbor (1985)

REFERENCE 7 (sites)
AUTHORS Rosen,C.A., Sodroski,J.G. and Haseltine,W.A.
TITLE The location of cis-acting regulatory sequences in the human T cell
lymphotropic virus type III (HTLV-III/LAV) long terminal repeat
JOURNAL Cell 41 (3), 813-823 (1985)
MEDLINE 85228232

REFERENCE 8 (sites)
AUTHORS Rabson,A.B., Daugherty,D.F., Venkatesan,S., Boulukos,K.E.,
Benn,S.I., Folks,T.M., Feorino,P. and Martin,M.A.
TITLE Transcription of novel open reading frames of AIDS retrovirus
during infection of lymphocytes
JOURNAL Science 229 (4720), 1388-1390 (1985)
MEDLINE 85300515

REFERENCE 9 (sites)
AUTHORS Allan,J.S., Coligan,J.E., Lee,T.H., McLane,M.F., Kanki,P.J.,
Groopman,J.E. and Essex,M.
TITLE A new HTLV-III/LAV encoded antigen detected by antibodies from AIDS
patients
JOURNAL Science 230 (4727), 810-813 (1985)
MEDLINE 86044509

REFERENCE 10 (sites)
AUTHORS Rosen,C.A., Sodroski,J.G., Goh,W.C., Dayton,A.I., Lippke,J. and
Haseltine,W.A.
TITLE Post-transcriptional regulation accounts for the trans-activation
of the human T-lymphotropic virus type III
JOURNAL Nature 319 (6054), 555-559 (1986)
MEDLINE 86118720

REFERENCE 11 (sites)
AUTHORS di Marzo Veronese,F., Copeland,T.D., DeVico,A.L., Rahman,R.,
Oroszlan,S., Gallo,R.C. and Sarngadharan,M.G.
TITLE Characterization of highly immunogenic p66/p51 as the reverse
transcriptase of HTLV-III/LAV
JOURNAL Science 231 (4743), 1289-1291 (1986)
MEDLINE 86122937

REFERENCE 12 (sites)
AUTHORS Kan,N.C., Franchini,G., Wong-Staal,F., DuBois,G.C., Robey,W.G.,
Lautenberger,J.A. and Papas,T.S.
TITLE Identification of HTLV-III/LAV sor gene product and detection of
antibodies in human sera
JOURNAL Science 231 (4745), 1553-1555 (1986)
MEDLINE 86151663

REFERENCE 13 (sites)
AUTHORS Kramer,R.A., Schaber,M.D., Skalka,A.M., Ganguly,K., Wong-Staal,F.
and Reddy,E.P.
TITLE HTLV-III gag protein is processed in yeast cells by the virus
pol-protease
JOURNAL Science 231 (4745), 1580-1584 (1986)
MEDLINE 86151671

REFERENCE 14 (sites)
AUTHORS Lee,T.H., Coligan,J.E., Allan,J.S., McLane,M.F., Groopman,J.E. and
Essex,M.
TITLE A new HTLV-III/LAV protein encoded by a gene found in cytopathic
retroviruses
JOURNAL Science 231 (4745), 1546-1549 (1986)
MEDLINE 86151661

REFERENCE 15 (sites)
AUTHORS Dayton,A.I., Sodroski,J.G., Rosen,C.A., Goh,W.C. and Haseltine,W.A.
TITLE The trans-activator gene of the human T cell lymphotropic virus
type III is required for replication
JOURNAL Cell 44 (6), 941-947 (1986)
MEDLINE 86161683

REFERENCE 16 (sites)
AUTHORS Sodroski,J., Goh,W.C., Rosen,C., Tartar,A., Portetelle,D., Burny,A.
and Haseltine,W.
TITLE Replicative and cytopathic potential of HTLV-III/LAV with sor gene
deletions
JOURNAL Science 231 (4745), 1549-1553 (1986)
MEDLINE 86151662

REFERENCE 17 (sites)
AUTHORS Arya,S.K. and Gallo,R.C.
TITLE Three novel genes of human T-lymphotropic virus type III: immune
reactivity of their products with sera from acquired immune
deficiency syndrome patients
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 83 (7), 2209-2213 (1986)
MEDLINE 86177573

REFERENCE 18 (sites)
AUTHORS Jones,K.A., Kadonaga,J.T., Luciw,P.A. and Tjian,R.
TITLE Activation of the AIDS retrovirus promoter by the cellular
transcription factor, Sp1
JOURNAL Science 232 (4751), 755-759 (1986)
MEDLINE 86179897

REFERENCE 19 (sites)
AUTHORS Sodroski,J., Goh,W.C., Rosen,C., Dayton,A., Terwilliger,E. and
Haseltine,W.
TITLE A second post-transcriptional trans-activator gene required for
HTLV-III replication
JOURNAL Nature 321 (6068), 412-417 (1986)
MEDLINE 86230863

REFERENCE 20 (sites)
AUTHORS Starcich,B.R., Hahn,B.H., Shaw,G.M., McNeely,P.D., Modrow,S.,
Wolf,H., Parks,E.S., Parks,W.P., Josephs,S.F., Gallo,R.C. and
Wong-Staal,F.
TITLE Identification and characterization of conserved and variable
regions in the envelope gene of HTLV-III/LAV, the retrovirus of
AIDS
JOURNAL Cell 45 (5), 637-648 (1986)
MEDLINE 86218077

REFERENCE 21 (sites)
AUTHORS Willey,R.L., Rutledge,R.A., Dias,S., Folks,T., Theodore,T.,
Buckler,C.E. and Martin,M.A.
TITLE Identification of conserved and divergent domains within the
envelope gene of the acquired immunodeficiency syndrome retrovirus
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 83 (14), 5038-5042 (1986)
MEDLINE 86259728

REFERENCE 22 (bases 8761 to 9060)
AUTHORS Fisher,A.G., Ratner,L., Mitsuya,H., Marselle,L.M., Harper,M.E.,
Broder,S., Gallo,R.C. and Wong-Staal,F.
TITLE Infectious mutants of HTLV-III with changes in the 3' region and
markedly reduced cytopathic effects
JOURNAL Science 233 (4764), 655-659 (1986)
MEDLINE 86261824

REFERENCE 23 (sites)
AUTHORS Feinberg,M.B., Jarrett,R.F., Aldovini,A., Gallo,R.C. and
Wong-Staal,F.
TITLE HTLV-III expression and production involve complex regulation at
the levels of splicing and translation of viral RNA
JOURNAL Cell 46 (6), 807-817 (1986)
MEDLINE 87002448

REFERENCE 24 (sites)
AUTHORS Lightfoote,M.M., Coligan,J.E., Folks,T.M., Fauci,A.S., Martin,M.A.
and Venkatesan,S.
TITLE Structural characterization of reverse transcriptase and
endonuclease polypeptides of the acquired immunodeficiency syndrome
retrovirus
JOURNAL J. Virol. 60 (2), 771-775 (1986)
MEDLINE 87036947

REFERENCE 25 (sites)
AUTHORS Wright,C.M., Felber,B.K., Paskalis,H. and Pavlakis,G.N.
TITLE Expression and characterization of the trans-activator of
HTLV-III/LAV virus
JOURNAL Science 234 (4779), 988-992 (1986)
MEDLINE 87042788

REFERENCE 26 (sites)
AUTHORS Terwilliger,E., Sodroski,J.G., Rosen,C.A. and Haseltine,W.A.
TITLE Effects of mutations within the 3' orf open reading frame region of
human T-cell lymphotropic virus type III (HTLV-III/LAV) on
replication and cytopathogenicity
JOURNAL J. Virol. 60 (2), 754-760 (1986)
MEDLINE 87036943

REFERENCE 27 (sites)
AUTHORS Goh,W.C., Sodroski,J.G., Rosen,C.A. and Haseltine,W.A.
TITLE Expression of the art gene protein of human T-lymphotropic virus
type III (HTLV-III/LAV) in bacteria
JOURNAL J. Virol. 61 (2), 633-637 (1987)
MEDLINE 87112968

REFERENCE 28 (sites)
AUTHORS Modrow,S., Hahn,B.H., Shaw,G.M., Gallo,R.C., Wong-Staal,F. and
Wolf,H.
TITLE Computer-assisted analysis of envelope protein sequences of seven
human immunodeficiency virus isolates: prediction of antigenic
epitopes in conserved and variable regions
JOURNAL J. Virol. 61 (2), 570-578 (1987)
MEDLINE 87112954

REFERENCE 29 (sites)
AUTHORS Muesing,M.A., Smith,D.H. and Capon,D.J.
TITLE Regulation of mRNA accumulation by a human immunodeficiency virus
trans-activator protein
JOURNAL Cell 48 (4), 691-701 (1987)
MEDLINE 87131081

REFERENCE 30 (sites)
AUTHORS Nabel,G. and Baltimore,D.
TITLE An inducible transcription factor activates expression of human
immunodeficiency virus in T cells
JOURNAL Nature 326 (6114), 711-713 (1987)
MEDLINE 87173065
REMARK Erratum:[Nature 1990 Mar 8;344(6262):178]
REFERENCE 31 (sites) 

AUTHORS Fisher,A.G., Ensoli,B., Ivanoff,L., Chamberlain,M., Petteway,S.,
Ratner,L., Gallo,R.C. and Wong-Staal,F.
TITLE The sor gene of HIV-1 is required for efficient virus transmission 

in vitro 

JOURNAL Science 237 (4817) , 888-893 (1987) 

MEDLINE 87292118 
REFERENCE 32 (sites) 

AUTHORS Patarca,R., Heath,C., Goldenberg,G.J., Rosen,C.A., Sodroski,J.G.,
Haseltine,W.A. and Hansen,U.M.

TITLE Transcription directed by the HIV long terminal repeat in vitro

JOURNAL AIDS Res. Hum. Retroviruses 3 (1), 41-55 (1987)

MEDLINE 87299195 
REFERENCE 33 (sites) 

AUTHORS Wong-Staal,F., Chanda,R.K. and Ghrayeb,J.

TITLE Human immunodeficiency virus: the eighth gene 

JOURNAL AIDS Res. Hum. Retroviruses 3 (1), 33-39 (1987) 

MEDLINE 87299194 
REFERENCE 34 (bases 1 to 9635; 1 to 9635) 

AUTHORS Ratner,L., Fisher,A., Jagodzinski,L.L., Mitsuya,H., Liou,R.S.,
Gallo,R.C. and Wong-Staal,F.

TITLE Complete nucleotide sequences of functional clones of the AIDS

virus

JOURNAL AIDS Res. Hum. Retroviruses 3 (1), 57-69 (1987)
MEDLINE 87299196 
REFERENCE 35 (bases 6225 to 8795) 

AUTHORS Reitz,M.S. Jr., Wilson,C., Naugle,C., Gallo,R.C. and
Robert-Guroff,M.

TITLE Generation of a neutralization-resistant variant of HIV-1 is due to 

selection for a point mutation in the envelope gene 

JOURNAL Cell 54 (1), 57-63 (1988) 

MEDLINE 88253426 
REFERENCE 36 (bases 790 to 2292)

AUTHORS Pal,R., Reitz,M.S. Jr., Tschachler,E., Gallo,R.C.,
Sarngadharan,M.G. and Veronese,F.D.

TITLE Myristoylation of gag proteins of HIV-1 plays an important role in 

virus assembly 

JOURNAL AIDS Res. Hum. Retroviruses 6 (6), 721-730 (1990) 
MEDLINE 90303964 
REFERENCE 37 (sites)

AUTHORS Ido,E., Han,H.P., Kezdy,F.J. and Tang,J.

TITLE Kinetic studies of human immunodeficiency virus type 1 protease and

its active-site hydrogen bond mutant A28S
JOURNAL J. Biol. Chem. 266 (36), 24359-24366 (1991)
MEDLINE 92105089 
COMMENT On Mar 25, 1997 this sequence version replaced gi:327742.

[6] sites; tat mRNA and other transcript boundaries. [7] sites;
tat mRNA.
[8] sites; mRNA splice sites.
[9] sites; 27K antigen cds.
[5] sites; gp160 and gp120 coding sequences.
[1] sites; regulatory sequences in the LTR.
[(in) Weiss,R., Teich,N., Varmus,H. and Coffin,J. (Eds.);RNA Tumor
Viruses, Secon] review; bases 1 to 9718.
[15] sites; trans-activator function and TAR sequence. [19]
sites; pol coding sequence.
[22] sites; 23K sor gene product.
[23] sites; pol NH2-terminal region.
[20] sites; sor 23K protein.
[21] sites; sor 23K protein.
[24] sites; Sp1 binding sites in the promoter region. [17] sites;
acceptor and donor splice sites for tat and 27K. [10] sites;
deletion mutants in the tat gene.



[18] sites; env gene conserved/variable regions; separate entries.
[16] sites; trs cds boundaries.
[12] sites; trs cds boundaries.
[11] sites; env gene conserved/variable regions; separate entries.
[26] sites; tar or transactivator target.
[13] sites; 3' orf mutations.
[14] sites; pol p34 terminus.



[31] sites; promoter, TAR, tat-III mutants.
[32] sites; envelope protein epitopes.
[33] sites; trs/art protein.
[34] sites; inducible enhancer element.
[27] revises [30].
[29] sites; long terminal repeat.
[28] sites; R orf.
[35] sites; sor.

Sequence for [25] kindly provided in computer-readable form by
L.Ratner, 19-AUG-1986.

The HXB2 sequence is being used as a reference genome for all the
HIV entries because it has been derived from a demonstrably
infectious clone. Hence not all of the 'sites' references above
were concerned with this isolate.
FEATURES Location/Qualifiers 
source 1..9719
/organism="Human immunodeficiency virus type 1"
/proviral
/isolate="HXB2"
/db_xref="taxon:11676"
/note="HTLV-III/LAV"
LTR 1..634
/note="5' LTR"
repeat_region 454..551
/note="R repeat 5' copy"
mRNA 455..9635
/product="HXB2 genomic mRNA"
prim_transcript 455..9635
/note="tat, trs, 27K subgenomic mRNA"
intron 744..5777
/note="tat, trs, 27K mRNA intron 1"
CDS 790..2292
/note="gag polyprotein"
/codon_start=1
/protein_id="AAB50258.1"
/db_xref="GI:327745"

/translation="MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERF
AVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALD
KIEEEQNKSKKKAQQAAADTGHSNQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPV
HAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRM
YSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTIL
KALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKC
FNCGKEGHTARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSYKGRPGNFLQ
SRPEPTAPPEESFRSGVETTTPPQKQEPIDKELYPLTSLRSLFGNDPSSQ"
CDS 2358..5096
/note="pol polyprotein (NH2-terminus uncertain)"
/codon_start=1
/protein_id="AAB50259.1"
/db_xref="GI:1906384"
/translation="MSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGP
TPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEIC
TEMEKEGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPH
PAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWK
GSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRW
GLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQI
YPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAE
IQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKT
PKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDG
AANRETKLGKAGYVTNRGRQKVVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQYA
LGIIQAQPDQSESELVNQIIEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGIRKVL
FLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSP
GIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTD
NGSNFTGATVRAACWWAGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKT
AVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKELQKQITKIQNFRVYYRDSRN
PLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED"
CDS 5041..5619
/note="sor 23K protein"
/codon_start=1
/protein_id="AAB50260.1"
/db_xref="GI:327747"
/translation="MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHY
ESPHPRISSEVHIPLGDARLVITTYWGLHTGERDWHLGQGVSIEWRKKRYSTQVDPEL
ADQLIHLYYFDCFSDSAIRKALLGHIVSPRCEYQAGHNKVGSLQYLALAALITPKKIK
PPLPSVTKLTEDRWNKPQKTKGHRGSHTMNGH"
CDS 5559..5795
/note="R (ORF) protein"
/codon_start=1
/protein_id="AAB50261.1"
/db_xref="GI:327748"
/translation="MEQAPEDQGPQREPHNEWTLELLEELKNEAVRHFPRIWLHGLGQ
HIYETYGDTWAGVEAIIRILQQLLFIHFQNWVST"
CDS join(5831..6045,8379..8424)
/note="tat protein"
/codon_start=1
/protein_id="AAB50256.1"
/db_xref="GI:1906383"
/translation="MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALG
ISYGRKKRRQRRRAHQNSQTHQASLSKQPTSQPRGDPTGPKE"
exon 5831..6045
/note="tat protein, first expressed exon"
/number=2
CDS join(5970..6045,8379..8653)
/note="trs protein"
/codon_start=1
/protein_id="AAB50257.1"
/db_xref="GI:327744"
/translation="MAGRSGDSDEELIRTVRLIKLLYQSNPPPNPEGTRQARRNRRRR
WRERQRQIHSISERILGTYLGRSAEPVPLQLPPLERLTLDCNEDCGTSGTQGVGSPQI
LVESPTVLESGTKE"
exon 5970..6045
/note="trs protein, first expressed exon"
/number=2
intron 6046..8378
/note="tat, trs, 27K mRNA intron 2"
CDS 6225..8795
/note="envelope polyprotein"
/codon_start=1
/protein_id="AAB50262.1"
/db_xref="GI:1906385"

/translation="MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPV
WKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTENFNMWKNDMVE
QMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFN
ISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYKLTSCNTSVITQACPKVSFEPIPIHYC
APAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSVN
FTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTIGKIGNMRQAHCNIS
RAKWNNTLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFN
STWFNSTWSTEGSNNTEGSDTITLPCRIKQIINMWQKVGKAMYAPPISGQIRCSSNIT
GLLLTRDGGNSNNESEIFRPGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVVQR
EKRAVGIGALFLGFLGAAGSTMGAASMTLTVQARQLLSGIVQQQNNLLRAIEAQQHLL
QLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEQIWN
HTTWMEWDREINNYTSLIHSLIEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYI
KLFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPLSFQTHLPTPRGPDRPEGIEEEGGER
DRDRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWN
LLQYWSQELKNSAVSLLNATAIAVAEGTDRVIEVVQGACRAIRHIPRRIRQGLERILL"



exon 8379..8652
/note="trs protein"
/number=3
exon 8379..8424
/note="tat protein"
/number=3
CDS 8797..9168
/note="27K protein (premature termination)"
/codon_start=1
/protein_id="AAB50263.1"
/db_xref="GI:1906386"
/translation="MGGKWSKSSVIGWPTVRERMRRAEPAADRVGAASRDLEKHGAIT
SSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIH
SQRRQDILDLWIYHTQGYFPD"
LTR 9086..9719
/note="3' LTR"
repeat_region 9540..9636
/note="R repeat 3' copy"
polyA_signal 9612..9617
/note="HXB2 mRNA polyadenylation signal"
BASE COUNT 3411 a 1772 c 2373 g 2163 t
ORIGIN

1 tggaagggct aattcactcc caacgaagac aagatatcct tgatctgtgg atctaccaca

61 cacaaggcta cttccctgat tagcagaact acacaccagg gccagggatc agatatccac 

121 tgacctttgg atggtgctac aagctagtac cagttgagcc agagaagtta gaagaagcca 

181 acaaaggaga gaacaccagc ttgttacacc ctgtgagcct gcatggaatg gatgacccgg 

241 agagagaagt gttagagtgg aggtttgaca gccgcctagc atttcatcac atggcccgag 

301 agctgcatcc ggagtacttc aagaactgct gacatcgagc ttgctacaag ggactttccg 

361 ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat 

421 cctgcatata agcagctgct ttttgcctgt actgggtctc tctggttaga ccagatctga
481 gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata aagcttgcct
541 tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta gagatccctc
601 agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg gacctgaaag
661 cgaaagggaa accagaggag ctctctcgac gcaggactcg gcttgctgaa gcgcgcacgg
721 caagaggcga ggggcggcga ctggtgagta cgccaaaaat tttgactagc ggaggctaga
781 aggagagaga tgggtgcgag agcgtcagta ttaagcgggg gagaattaga tcgatgggaa
841 aaaattcggt taaggccagg gggaaagaaa aaatataaat taaaacatat agtatgggca
901 agcagggagc tagaacgatt cgcagttaat cctggcctgt tagaaacatc agaaggctgt
961 agacaaatac tgggacagct acaaccatcc cttcagacag gatcagaaga acttagatca
1021 ttatataata cagtagcaac cctctattgt gtgcatcaaa ggatagagat aaaagacacc
1081 aaggaagctt tagacaagat agaggaagag caaaacaaaa gtaagaaaaa agcacagcaa
1141 gcagcagctg acacaggaca cagcaatcag gtcagccaaa attaccctat agtgcagaac
1201 atccaggggc aaatggtaca tcaggccata tcacctagaa ctttaaatgc atgggtaaaa
1261 gtagtagaag agaaggcttt cagcccagaa gtgataccca tgttttcagc attatcagaa
1321 ggagccaccc cacaagattt aaacaccatg ctaaacacag tggggggaca tcaagcagcc
1381 atgcaaatgt taaaagagac catcaatgag gaagctgcag aatgggatag agtgcatcca
1441 gtgcatgcag ggcctattgc accaggccag atgagagaac caaggggaag tgacatagca
1501 ggaactacta gtacccttca ggaacaaata ggatggatga caaataatcc acctatccca
1561 gtaggagaaa tttataaaag atggataatc ctgggattaa ataaaatagt aagaatgtat
1621 agccctacca gcattctgga cataagacaa ggaccaaagg aaccctttag agactatgta
1681 gaccggttct ataaaactct aagagccgag caagcttcac aggaggtaaa aaattggatg
1741 acagaaacct tgttggtcca aaatgcgaac ccagattgta agactatttt aaaagcattg
1801 ggaccagcgg ctacactaga agaaatgatg acagcatgtc agggagtagg aggacccggc
1861 cataaggcaa gagttttggc tgaagcaatg agccaagtaa caaattcagc taccataatg
1921 atgcagagag gcaattttag gaaccaaaga aagattgtta agtgtttcaa ttgtggcaaa
1981 gaagggcaca cagccagaaa ttgcagggcc cctaggaaaa agggctgttg gaaatgtgga
2041 aaggaaggac accaaatgaa agattgtact gagagacagg ctaatttttt agggaagatc
2101 tggccttcct acaagggaag gccagggaat tttcttcaga gcagaccaga gccaacagcc
2161 ccaccagaag agagcttcag gtctggggta gagacaacaa ctccccctca gaagcaggag
2221 ccgatagaca aggaactgta tcctttaact tccctcaggt cactctttgg caacgacccc
2281 tcgtcacaat aaagataggg gggcaactaa aggaagctct attagataca ggagcagatg
2341 atacagtatt agaagaaatg agtttgccag gaagatggaa accaaaaatg atagggggaa
2401 ttggaggttt tatcaaagta agacagtatg atcagatact catagaaatc tgtggacata
2461 aagctatagg tacagtatta gtaggaccta cacctgtcaa cataattgga agaaatctgt
2521 tgactcagat tggttgcact ttaaattttc ccattagccc tattgagact gtaccagtaa
2581 aattaaagcc aggaatggat ggcccaaaag ttaaacaatg gccattgaca gaagaaaaaa
2641 taaaagcatt agtagaaatt tgtacagaga tggaaaagga agggaaaatt tcaaaaattg
2701 ggcctgaaaa tccatacaat actccagtat ttgccataaa gaaaaaagac agtactaaat
2761 ggagaaaatt agtagatttc agagaactta ataagagaac tcaagacttc tgggaagttc
2821 aattaggaat accacatccc gcagggttaa aaaagaaaaa atcagtaaca gtactggatg
2881 tgggtgatgc atatttttca gttcccttag atgaagactt caggaagtat actgcattta
2941 ccatacctag tataaacaat gagacaccag ggattagata tcagtacaat gtgcttccac
3001 agggatggaa aggatcacca gcaatattcc aaagtagcat gacaaaaatc ttagagcctt
3061 ttagaaaaca aaatccagac atagttatct atcaatacat ggatgatttg tatgtaggat
3121 ctgacttaga aatagggcag catagaacaa aaatagagga gctgagacaa catctgttga
3181 ggtggggact taccacacca gacaaaaaac atcagaaaga acctccattc ctttggatgg
3241 gttatgaact ccatcctgat aaatggacag tacagcctat agtgctgcca gaaaaagaca
3301 gctggactgt caatgacata cagaagttag tggggaaatt gaattgggca agtcagattt
3361 acccagggat taaagtaagg caattatgta aactccttag aggaaccaaa gcactaacag
3421 aagtaatacc actaacagaa gaagcagagc tagaactggc agaaaacaga gagattctaa
3481 aagaaccagt acatggagtg tattatgacc catcaaaaga cttaatagca gaaatacaga
3541 agcaggggca aggccaatgg acatatcaaa tttatcaaga gccatttaaa aatctgaaaa
3601 caggaaaata tgcaagaatg aggggtgccc acactaatga tgtaaaacaa ttaacagagg
3661 cagtgcaaaa aataaccaca gaaagcatag taatatgggg aaagactcct aaatttaaac
3721 tgcccataca aaaggaaaca tgggaaacat ggtggacaga gtattggcaa gccacctgga
3781 ttcctgagtg ggagtttgtt aatacccctc ccttagtgaa attatggtac cagttagaga
3841 aagaacccat agtaggagca gaaaccttct atgtagatgg ggcagctaac agggagacta
3901 aattaggaaa agcaggatat gttactaata gaggaagaca aaaagttgtc accctaactg
3961 acacaacaaa tcagaagact gagttacaag caatttatct agctttgcag gattcgggat
4021 tagaagtaaa catagtaaca gactcacaat atgcattagg aatcattcaa gcacaaccag
4081 atcaaagtga atcagagtta gtcaatcaaa taatagagca gttaataaaa aaggaaaagg
4141 tctatctggc atgggtacca gcacacaaag gaattggagg aaatgaacaa gtagataaat
4201 tagtcagtgc tggaatcagg aaagtactat ttttagatgg aatagataag gcccaagatg
4261 aacatgagaa atatcacagt aattggagag caatggctag tgattttaac ctgccacctg
4321 tagtagcaaa agaaatagta gccagctgtg ataaatgtca gctaaaagga gaagccatgc
4381 atggacaagt agactgtagt ccaggaatat ggcaactaga ttgtacacat ttagaaggaa
4441 aagttatcct ggtagcagtt catgtagcca gtggatatat agaagcagaa gttattccag
4501 cagaaacagg gcaggaaaca gcatattttc ttttaaaatt agcaggaaga tggccagtaa
4561 aaacaataca tactgacaat ggcagcaatt tcaccggtgc tacggttagg gccgcctgtt
4621 ggtgggcggg aatcaagcag gaatttggaa ttccctacaa tccccaaagt caaggagtag
4681 tagaatctat gaataaagaa ttaaagaaaa ttataggaca ggtaagagat caggctgaac
4741 atcttaagac agcagtacaa atggcagtat tcatccacaa ttttaaaaga aaagggggga
4801 ttggggggta cagtgcaggg gaaagaatag tagacataat agcaacagac atacaaacta
4861 aagaattaca aaaacaaatt acaaaaattc aaaattttcg ggtttattac agggacagca
4921 gaaatccact ttggaaagga ccagcaaagc tcctctggaa aggtgaaggg gcagtagtaa
4981 tacaagataa tagtgacata aaagtagtgc caagaagaaa agcaaagatc attagggatt
5041 atggaaaaca gatggcaggt gatgattgtg tggcaagtag acaggatgag gattagaaca
5101 tggaaaagtt tagtaaaaca ccatatgtat gtttcaggga aagctagggg atggttttat
5161 agacatcact atgaaagccc tcatccaaga ataagttcag aagtacacat cccactaggg
5221 gatgctagat tggtaataac aacatattgg ggtctgcata caggagaaag agactggcat
5281 ttgggtcagg gagtctccat agaatggagg aaaaagagat atagcacaca agtagaccct
5341 gaactagcag accaactaat tcatctgtat tactttgact gtttttcaga ctctgctata
5401 agaaaggcct tattaggaca catagttagc cctaggtgtg aatatcaagc aggacataac
5461 aaggtaggat ctctacaata cttggcacta gcagcattaa taacaccaaa aaagataaag
5521 ccacctttgc ctagtgttac gaaactgaca gaggatagat ggaacaagcc ccagaagacc
5581 aagggccaca gagggagcca cacaatgaat ggacactaga gcttttagag gagcttaaga
5641 atgaagctgt tagacatttt cctaggattt ggctccatgg cttagggcaa catatctatg
5701 aaacttatgg ggatacttgg gcaggagtgg aagccataat aagaattctg caacaactgc
5761 tgtttatcca ttttcagaat tgggtgtcga catagcagaa taggcgttac tcgacagagg
5821 agagcaagaa atggagccag tagatcctag actagagccc tggaagcatc caggaagtca
5881 gcctaaaact gcttgtacca attgctattg taaaaagtgt tgctttcatt gccaagtttg
5941 tttcataaca aaagccttag gcatctccta tggcaggaag aagcggagac agcgacgaag
6001 agctcatcag aacagtcaga ctcatcaagc ttctctatca aagcagtaag tagtacatgt
6061 aacgcaacct ataccaatag tagcaatagt agcattagta gtagcaataa taatagcaat
6121 agttgtgtgg tccatagtaa tcatagaata taggaaaata ttaagacaaa gaaaaataga
6181 caggttaatt gatagactaa tagaaagagc agaagacagt ggcaatgaga gtgaaggaga
6241 aatatcagca cttgtggaga tgggggtgga gatggggcac catgctcctt gggatgttga
6301 tgatctgtag tgctacagaa aaattgtggg tcacagtcta ttatggggta cctgtgtgga
6361 aggaagcaac caccactcta ttttgtgcat cagatgctaa agcatatgat acagaggtac
6421 ataatgtttg ggccacacat gcctgtgtac ccacagaccc caacccacaa gaagtagtat
6481 tggtaaatgt gacagaaaat tttaacatgt ggaaaaatga catggtagaa cagatgcatg
6541 aggatataat cagtttatgg gatcaaagcc taaagccatg tgtaaaatta accccactct
6601 gtgttagttt aaagtgcact gatttgaaga atgatactaa taccaatagt agtagcggga
6661 gaatgataat ggagaaagga gagataaaaa actgctcttt caatatcagc acaagcataa
6721 gaggtaaggt gcagaaagaa tatgcatttt tttataaact tgatataata ccaatagata
6781 atgatactac cagctataag ttgacaagtt gtaacacctc agtcattaca caggcctgtc
6841 caaaggtatc ctttgagcca attcccatac attattgtgc cccggctggt tttgcgattc
6901 taaaatgtaa taataagacg ttcaatggaa caggaccatg tacaaatgtc agcacagtac
6961 aatgtacaca tggaattagg ccagtagtat caactcaact gctgttaaat ggcagtctag
7021 cagaagaaga ggtagtaatt agatctgtca atttcacgga caatgctaaa accataatag
7081 tacagctgaa cacatctgta gaaattaatt gtacaagacc caacaacaat acaagaaaaa
7141 gaatccgtat ccagagagga ccagggagag catttgttac aataggaaaa ataggaaata
7201 tgagacaagc acattgtaac attagtagag caaaatggaa taacacttta aaacagatag
7261 ctagcaaatt aagagaacaa tttggaaata ataaaacaat aatctttaag caatcctcag
7321 gaggggaccc agaaattgta acgcacagtt ttaattgtgg aggggaattt ttctactgta
7381 attcaacaca actgtttaat agtacttggt ttaatagtac ttggagtact gaagggtcaa 
7441 ataacactga aggaagtgac acaatcaccc tcccatgcag aataaaacaa attataaaca 
7501 tgtggcagaa agtaggaaaa gcaatgtatg cccctcccat cagtggacaa attagatgtt 
7561 catcaaatat tacagggctg ctattaacaa gagatggtgg taatagcaac aatgagtccg 
7621 agatcttcag acctggagga ggagatatga gggacaattg gagaagtgaa ttatataaat 
7681 ataaagtagt aaaaattgaa ccattaggag tagcacccac caaggcaaag agaagagtgg
7741 tgcagagaga aaaaagagca gtgggaatag gagctttgtt ccttgggttc ttgggagcag
7801 caggaagcac tatgggcgca gcctcaatga cgctgacggt acaggccaga caattattgt
7861 ctggtatagt gcagcagcag aacaatttgc tgagggctat tgaggcgcaa cagcatctgt
7921 tgcaactcac agtctggggc atcaagcagc tccaggcaag aatcctggct gtggaaagat 
7981 acctaaagga tcaacagctc ctggggattt ggggttgctc tggaaaactc atttgcacca 
8041 ctgctgtgcc ttggaatgct agttggagta ataaatctct ggaacagatt tggaatcaca 
8101 cgacctggat ggagtgggac agagaaatta acaattacac aagcttaata cactccttaa 
8161 ttgaagaatc gcaaaaccag caagaaaaga atgaacaaga attattggaa ttagataaat 
8221 gggcaagttt gtggaattgg tttaacataa caaattggct gtggtatata aaattattca 
8281 taatgatagt aggaggcttg gtaggtttaa gaatagtttt tgctgtactt tctatagtga 
8341 atagagttag gcagggatat tcaccattat cgtttcagac ccacctccca accccgaggg 
8401 gacccgacag gcccgaagga atagaagaag aaggtggaga gagagacaga gacagatcca 
8461 ttcgattagt gaacggatcc ttggcactta tctgggacga tctgcggagc ctgtgcctct 
8521 tcagctacca ccgcttgaga gacttactct tgattgtaac gaggattgtg gaacttctgg 
8581 gacgcagggg gtgggaagcc ctcaaatatt ggtggaatct cctacagtat tggagtcagg 
8641 aactaaagaa tagtgctgtt agcttgctca atgccacagc catagcagta gctgagggga 
8701 cagatagggt tatagaagta gtacaaggag cttgtagagc tattcgccac atacctagaa 
8761 gaataagaca gggcttggaa aggattttgc tataagatgg gtggcaagtg gtcaaaaagt 
8821 agtgtgattg gatggcctac tgtaagggaa agaatgagac gagctgagcc agcagcagat 
8881 agggtgggag cagcatctcg agacctggaa aaacatggag caatcacaag tagcaataca 
8941 gcagctacca atgctgcttg tgcctggcta gaagcacaag aggaggagga ggtgggtttt 
9001 ccagtcacac ctcaggtacc tttaagacca atgacttaca aggcagctgt agatcttagc 
9061 cactttttaa aagaaaaggg gggactggaa gggctaattc actcccaaag aagacaagat 
9121 atccttgatc tgtggatcta ccacacacaa ggctacttcc ctgattagca gaactacaca 
9181 ccagggccag gggtcagata tccactgacc tttggatggt gctacaagct agtaccagtt 
9241 gagccagata agatagaaga ggccaataaa ggagagaaca ccagcttgtt acaccctgtg 
9301 agcctgcatg ggatggatga cccggagaga gaagtgttag agtggaggtt tgacagccgc 
9361 ctagcatttc atcacgtggc ccgagagctg catccggagt acttcaagaa ctgctgacat
9421 cgagcttgct acaagggact ttccgctggg gactttccag ggaggcgtgg cctgggcggg
9481 actggggagt ggcgagccct cagatcctgc atataagcag ctgctttttg cctgtactgg 
9541 gtctctctgg ttagaccaga tctgagcctg ggagctctct ggctaactag ggaacccact 
9601 gcttaagcct caataaagct tgccttgagt gcttcaagta gtgtgtgccc gtctgttgtg 
9661 tgactctggt aactagagat ccctcagacc cttttagtca gtgtggaaaa tctctagca 
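
The FEATURES table of the record gives each coding region as a location string, either a simple range such as 790..2292 or a spliced join(...) of several ranges. As a minimal sketch (the function names `parse_location` and `spliced_length` are hypothetical, and only the two location forms that occur in this record are handled), the spliced length of a feature can be checked as follows:

```python
import re

# Illustrative sketch only; function names are hypothetical.

def parse_location(location):
    """Parse 'a..b' or 'join(a..b,c..d,...)' into a list of (start, end) pairs."""
    text = location.replace(" ", "")
    if text.startswith("join(") and text.endswith(")"):
        text = text[len("join("):-1]
    segments = []
    for part in text.split(","):
        match = re.fullmatch(r"(\d+)\.\.(\d+)", part)
        if match is None:
            raise ValueError(f"unsupported location segment: {part!r}")
        segments.append((int(match.group(1)), int(match.group(2))))
    return segments


def spliced_length(location):
    """Total number of bases covered; both endpoints of each range are included."""
    return sum(end - start + 1 for start, end in parse_location(location))


tat_len = spliced_length("join(5831..6045,8379..8424)")
gag_len = spliced_length("790..2292")
print(tat_len, tat_len // 3)
print(gag_len, gag_len // 3)
```

A length that is a multiple of three is consistent with a complete reading frame that includes its stop codon; for the tat location this gives 87 codons.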



The oligonucleotide probe for the AP-1 sequence in step 63 of the original application
was synthesized using the PMA-responsive element as the consensus sequence, as indicated
by the reference of Northrop et al., 1993, with flanking sequences added.

These two sequences, which were used as probes, are representative examples that
demonstrate the methodology of DNA-protein interaction. Any other relevant
sequence(s) can be used for this purpose, as Claim 4 has been amended by the
Examiner.



